Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Test and fix the NULL handling of the String functions #68379

Merged
merged 14 commits into from
Feb 5, 2021

Conversation

palesz
Copy link
Contributor

@palesz palesz commented Feb 2, 2021

Fixed the inconsistencies regarding NULL argument handling.
NULL literal vs NULL field value as function arguments in some case
resulted in different function return values.

Functions should return with the same value no matter if the argument(s)
came from a field or from a literal.

The introduced integration test tests if function calls with same
argument values (regardless of literal/field) will return with the
same output (also checks if newly added functions are checked
by the test).

Fixed the following functions:

  • Insert: NULL start, length and replacement arguments (as fields) also
    result in NULL return value instead of returning the input.
  • Locate: NULL pattern results in NULL return value
  • Replace: NULL pattern and replacement results in NULL instead of
    returning the input
  • Substring: NULL start or length results in NULL instead of returning
    the input

Fixes #58907

Fixed the inconsistencies regarding NULL argument handling.
NULL literal vs NULL field value as function arguments in some case
resulted in different function return values.

Functions should return with the same value no matter if the argument(s)
came from a field or from a literal.

The introduced integration test tests if function calls with same
argument values (regardless of literal/field) will return with the
same output (also checks if newly added functions are added to the
testcases).

Fixed the following functions:
* Insert: NULL start, length and replacement arguments (as fields) also
result in NULL return value instead of returning the input.
* Locate: NULL pattern results in NULL return value, NULL start
parameter results 0 return value (0 = no match) instead of NULL
* Replace: NULL pattern and replacement results in NULL instead of
returning the input
* Substring: NULL start or length results in NULL instead of returning
the input

Fixes elastic#58907
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

Comment on lines 350 to 351
false |83
true |13
Copy link
Contributor Author

@palesz palesz Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially curious about your opinions on this one:

These changes are the result of start=NULL{f} argument handling (this is an explicit start=NULL vs leaving out the optional parameter).

Previously:
LOCATE(pattern,input,start=NULL{field}) was handled as LOCATE(pattern,input,start=1) or LOCATE(pattern, input), but LOCATE(pattern,input,start=NULL{literal}) resulted in NULL.

Now in both cases we return with 0 to be in-sync with the MySQL behaviour (where LOCATE comes from).

Possible alternative approach: to stay backwards-compatible and not change query results where the start comes from a field, make LOCATE(pattern,input,start=NULL{literal}) behave the as LOCATE(pattern,input).

Note:
MySQL behaviour: LOCATE(pattern,input,start=NULL) => 0
MS SQL behaviour: CHARINDEX(pattern,input,start=NULL) => NULL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think it would follow the MySQL behaviour since LOCATE comes from there.
If we had to change the behaviour only for a literal NULL I would say to go ahead and provide it as a bug fix, but since now we changes how it behaves for a NULL coming from a field, we should really consider. I guess we can always declare it a breaking change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted to the alternative approach and made the LOCATE(pattern,input,start=NULL{literal}) mimic the current behaviour of the LOCATE(pattern,input,start=NULL{field}) in this PR. Our LOCATE also handles the negative start values differently from MySQL. start=missing/NULL/negative value all result the same as start=1 (tries to find the pattern in the whole input) while MySQL considers start=negative value/NULL as invalid and returns with 0 (no match). Because LOCATE treats NULL as the start argument specially even in MySQL (not the same as any other argument being NULL) I will address these differences in a separate PR.

I am still adding the breaking change label to this PR, since the behaviour of the functions change anyways (NULL field values will start returning NULL values instead of the original input strings).

@mark-vieira
Copy link
Contributor

@elasticmachine update branch

* <p>To ignore any of the tests, add an .ignore() method call after the Fn ctors in the FUNCTION_CALLS_TO_TEST list below, like:
* <code> new Fn("ASCII", "foobar").ignore()</code></p>
*/
public class ConsistentFunctionArgHandlingIT extends JdbcIntegrationTestCase {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for generating the test case to test various combinations however I find the current approach too convoluted.
Instead of indexing the data, filtering the functions and generating queries through an integration test one could use the FunctionRegistry directly and just focus on the string functions either by declaring their arguments as in this test or inspecting the returned definitions.

Specify the arguments or generate the loops based on the signature and pass them to the definition to get back the actual function instance to fold.

The resulting code should be smaller, without strings and faster due to be being a unit test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that writing a unit test would result in faster tests, however testing if:

  1. processors
  2. Function.fold() (optimizer folding rule that calls that if all args are foldable, calls the processor)
  3. optimizer rules that rewrite the function call in the query (NULL folding for example that depends on Function.nullable())
  4. QueryTranslator output (function calls turn into painless scripts that call the processors within Query DSL queries)

all work together in a consistent way providing the same outcome regardless the argument sources being field/literal is easier through integration tests, because the different argument source combinations will hit all of the above code paths using the same test code.

I agree though that in unit tests like LocateProcessorTests (or one similar to this in the PR) we could easily test if fold() and the processor leads to the same results, and likely it'll catch some of the bugs. But it will not cover #3 and #4.

To cover #3 we would have to make sure we call the Optimizer too, check if the function transformed by the rules will provide an output equivalent to fold and processor calls.
To cover #4 we would have to execute the query generated by the QueryTranslator (or at least the generated Painless script), which I think is easier to do through integration tests and would require quite a bit of mocking in a unit test.

This looks too much of a whitebox testing for me. I'd rather execute the same code that the user's code would execute with less assumptions about our implementation. IMO it is easier to argue about the coverage of this integration test that uses the same approach to hit all the four cases above.

Discovery: SqlFunctionRegistry is not accessible in integration test (not impossible to fix), but even if it would be (or would want to use in a unit test): I don't see how can I easily figure out the type of the arguments from the FunctionDefinition. The constructor arguments of the functions are Expressions without hints about the actual accepted types and the Builders also don't provide help. One way is to construct a function with generic/widely unacceptable arguments, call resolveType() and parse the error messages. It would help if the functions would be self-decribing (we could even generate a function arguments meta table with type, optional, accepted sources per function that might be useful somewhere else), but it is outside of the scope of this PR.

@@ -80,6 +81,11 @@ protected Pipe makePipe() {
return NodeInfo.create(this, Locate::new, pattern, input, start);
}

@Override
public Nullability nullable() {
return Nullability.FALSE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the function returns null if any argument is null, this should be TRUE not FALSE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOCATE does not return NULL if start=NULL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the other parameters?

Copy link
Contributor Author

@palesz palesz Feb 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any of the pattern or input is NULL, the result is NULL. Only start is special.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the function is nullable as oppose to never nullable. Depending on the params (outside start) it can TRUE or UKNOWN - see Nullability.and.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, changed.

@costin
Copy link
Member

costin commented Feb 3, 2021

Not only the functions need to be checked but also the underlying processors which are called for performing extractions or working on painless. This is needed to make sure that passing nulls to processors works the same as passing nulls to the functions.

While the processors are tested separately, it's worth looking into a test that checks both with the same set of arguments. At that point, manually declaring the functions and the processors plus their args might be easier than going for the discovery approach.

@matriv
Copy link
Contributor

matriv commented Feb 3, 2021

@palesz With the opportunity of this fix, please adjust the docs for those functions, where currently seem to not mention (apart from LOCATE) the edge cases with null args.

@palesz
Copy link
Contributor Author

palesz commented Feb 3, 2021

@palesz With the opportunity of this fix, please adjust the docs for those functions, where currently seem to not mention (apart from LOCATE) the edge cases with null args.

See 7b8de11 of this PR. Clarified the edge cases on CONCAT and LOCATE (current behavior kept in this PR). The other functions don't have edge cases, any NULL args will return NULL, so left their doc untouched.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one question mainly, the second one can be left unanswered as I'm not too convinced the alternative is better.
@costin has a point about the IT tests being maybe a bit too much for what they are achieving. There are ~500 queries being executed when the tests run. If the impact on the execution time is not too much, I'd keep them.
Otherwise LGTM.

this.name = name;
this.arguments = new ArrayList<>();
for (Object a : arguments) {
this.arguments.add(Argument.class.isAssignableFrom(a.getClass()) ? (Argument) a : new Argument(a));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can there already be an Argument sent as a parameter to this constructor? I am only seeing non-Argument instances being used....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously I used for one of the cases (LOCATE). Now this is unused and can be removed, but might come handy if we have a restriction that some of the arguments cannot come from fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palesz I would suggest removing it and use it when needed. Someone looking at the code gets confused about a functionality that's not actually there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

// first set the same value, so the mapping is populated for the null columns
testDoc.put(nullArgPrefix + idx, arg.exampleValue);
}
index(indexName, functionName, body -> body.mapContents(testDoc));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it easier (would definitely be less indexing) to use a mapping to create the index rather than having the index mapping be auto-detected by ES at the first document being indexed?
On the other hand, using a fixed mapping is kind of messy... there are a lot of fields in it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking along the same lines. It would definitely be more code.

@matriv
Copy link
Contributor

matriv commented Feb 4, 2021

@palesz With the opportunity of this fix, please adjust the docs for those functions, where currently seem to not mention (apart from LOCATE) the edge cases with null args.

See 7b8de11 of this PR. Clarified the edge cases on CONCAT and LOCATE (current behavior kept in this PR). The other functions don't have edge cases, any NULL args will return NULL, so left their doc untouched.

Thx! Though, myself, I'd prefer to explicitly state what's the behaviour with NULL in all cases, like we do for example in: https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-datetime.html#sql-functions-datetime-datetimeparse:

Description: Returns a datetime by parsing the 1st argument using the format specified in the 2nd argument.
The parsing format pattern used is the one from java.time.format.DateTimeFormatter. 
*If any of the two arguments is null or an empty string null is returned*.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@palesz
Copy link
Contributor Author

palesz commented Feb 4, 2021

@costin has a point about the IT tests being maybe a bit too much for what they are achieving. There are ~500 queries being executed when the tests run.

@astefan See my previous comment #68379 (comment) about the reasoning behind integration tests.

If the impact on the execution time is not too much, I'd keep them.

The impact on execution time is about ~13 seconds. This test runs as part of the ci/2 that takes ~60 minutes.

@palesz
Copy link
Contributor Author

palesz commented Feb 4, 2021

Thx! Though, myself, I'd prefer to explicitly state what's the behaviour with NULL in all cases

As discussed, will handle in #68531

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

return input;
}
if ((replacement instanceof String || replacement instanceof Character) == false) {
if (!(replacement instanceof String || replacement instanceof Character)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: was the change intentional (here and one more spot in the patch)?
(Myself I dislike == false, but since this the direction the train is going, I'm comfier with a consistent style.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out. No, it wasn't, I think just IDEA tried to be helpful (will revert this one).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted.

@palesz palesz requested a review from astefan February 4, 2021 16:45
this.name = name;
this.arguments = new ArrayList<>();
for (Object a : arguments) {
this.arguments.add(Argument.class.isAssignableFrom(a.getClass()) ? (Argument) a : new Argument(a));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@palesz I would suggest removing it and use it when needed. Someone looking at the code gets confused about a functionality that's not actually there.

assertTrue(retVal.next());
results.add(tuple(functionName + "(" + join(", ", functionCallArgsForAssert) + ")", retVal.getObject(1)));
// only a single row should be returned
assertFalse(retVal.next());

Stream<Object> returnValueStream = results.stream().map(Tuple::v2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this and now go over the stream components twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly streams can only be iterated once, during the second iteration it gave stream is already closed exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the tests pass then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the second iteration only happens if the test finds an issue (to create the detailed message for error).

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

static {
try {
Class<?> c = ConsistentFunctionArgHandlingIT.class;
NON_TESTED_FUNCTIONS = Files.readAllLines(Path.of(c.getResource(c.getSimpleName() + "-non-tested-functions.txt").toURI()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if it's possible get all the functions from FunctionRegistry:
Something like: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/planner/QueryTranslatorTests.java#L2199

Can also be done at a later point and not as part of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SqlFunctionRegistry is not available in integration tests (can add it as a dependency, but I don't think we really need it). I get the list of the functions from SHOW FUNCTIONS see here, that achieves pretty much the same that SqlFunctionRegistry would (for more see the last paragraph here). The purpose of the NON_TESTED_FUNCTIONS list is to identify functions added after writing this test class which likely we should think about adding to the tested function list / non tested function list (one-liner). Newly added = SHOW FUNCTIONS \ FUNCTIONS_CALLS \ NON_TESTED_FUNCTIONS should be an empty set. The NON_TESTED_FUNCTIONS list should shrink later once the datetime, ... functions are also added to the tested function list (separate PR).

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@palesz palesz merged commit a3dbdae into elastic:master Feb 5, 2021
palesz pushed a commit to palesz/elasticsearch that referenced this pull request Feb 8, 2021
…68379)

Fixed the inconsistencies regarding NULL argument handling.
NULL literal vs NULL field value as function arguments in some case
resulted in different function return values.

Functions should return with the same value no matter if the argument(s)
came from a field or from a literal.

The introduced integration test tests if function calls with same
argument values (regardless of literal/field) will return with the
same output (also checks if newly added functions are added to the
testcases).

Fixed the following functions:
* Insert: NULL start, length and replacement arguments (as fields) also
result in NULL return value instead of returning the input.
* Locate: NULL pattern results in NULL return value, NULL optional start
argument handled the same as missing start argument
* Replace: NULL pattern and replacement results in NULL instead of
returning the input
* Substring: NULL start or length results in NULL instead of returning
the input

Fixes elastic#58907
palesz pushed a commit that referenced this pull request Feb 8, 2021
…68684)

Fixed the inconsistencies regarding NULL argument handling.
NULL literal vs NULL field value as function arguments in some case
resulted in different function return values.

Functions should return with the same value no matter if the argument(s)
came from a field or from a literal.

The introduced integration test tests if function calls with same
argument values (regardless of literal/field) will return with the
same output (also checks if newly added functions are added to the
testcases).

Fixed the following functions:
* Insert: NULL start, length and replacement arguments (as fields) also
result in NULL return value instead of returning the input.
* Locate: NULL pattern results in NULL return value, NULL optional start
argument handled the same as missing start argument
* Replace: NULL pattern and replacement results in NULL instead of
returning the input
* Substring: NULL start or length results in NULL instead of returning
the input

Fixes #58907

(cherry-pick from a3dbdae)
@palesz palesz deleted the string-fn-null-handling branch February 8, 2021 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQL: fix string functions handling of NULL parameter values
8 participants